[data] feat: add support for swe rebench v2 by yyDing1 · Pull Request #56 · verl-project/uni-agent

yyDing1 · 2026-06-07T17:16:15Z

What does this PR do?

as title

Checklist Before Starting

Search for similar PRs or issues and paste at least one relevant link here: ...
Format the PR title as [{modules}] {type}: {description} (checked by CI)
- {modules} may include core, interaction, model, env, tools, deployment, reward, dashboard, docs, examples, data, train, ci, build, deps, misc
- If this PR involves multiple modules, separate them with , like [interaction, tools, docs]
- {type} must be one of feat, fix, refactor, chore, test
- If this PR breaks an API, config contract, workflow, or other compatibility boundary, add [BREAKING] to the beginning of the title
- For a stacked PR series, you may prepend a progress marker such as [1/N]
- Example: [BREAKING][deployment, docs] feat: simplify runtime env configuration

Test

List the checks you ran. If CI coverage is not practical for this change, describe the manual validation or experiment results.

API and Usage Example

Show any public interface changes or updated usage examples if relevant.

# Add a short example here when the PR changes public behavior

Design & Code Changes

Summarize the approach for non-trivial changes and call out important implementation details or trade-offs.

Checklist Before Submitting

Read the Contribute Guide
Run pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add or update docs/examples for user-facing changes
Add tests or explain why tests are not practical
Confirm the PR title matches the required format
Confirm the placeholder text in this template has been replaced with real content

gemini-code-assist

Code Review

This pull request introduces support for the nebius/SWE-rebench-V2 dataset (Python slice), including a preprocessing script, a dedicated reward spec, and test-log parsers. Feedback on these changes highlights several robustness issues: a potential NameError in the preprocessing script when no languages are specified, potential KeyError and IndexError exceptions when parsing metadata, and a potential AttributeError if a patch is not a string. Additionally, immediately stripping whitespace from command observations in env.py could strip crucial formatting like indentation or tracebacks, so it is recommended to only strip during the empty check.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-07T17:18:24Z

+    if languages:
+        wanted = {lang.lower() for lang in languages}
+        dataset = dataset.filter(lambda ex: (ex.get("language") or "").lower() in wanted)
+    dataset = dataset.filter(lambda ex: ex["instance_id"] not in SKIP_SAMPLES)
+
+    print(f"Kept {len(dataset)} instances after language filter {sorted(wanted)}", flush=True)


If languages is falsy or None, the wanted variable is never defined. This will raise a NameError: name 'wanted' is not defined when printing the log on line 194. Defining wanted beforehand or conditionally formatting the log message resolves this issue.

wanted = None if languages: wanted = {lang.lower() for lang in languages} dataset = dataset.filter(lambda ex: (ex.get("language") or "").lower() in wanted) dataset = dataset.filter(lambda ex: ex["instance_id"] not in SKIP_SAMPLES) filter_desc = sorted(wanted) if wanted else "None" print(f"Kept {len(dataset)} instances after language filter {filter_desc}", flush=True)

gemini-code-assist · 2026-06-07T17:18:24Z

+            "log_parser": install_config["log_parser"],
+            "test_cmd": install_config["test_cmd"],


If example["install_config"] is None, install_config defaults to an empty dictionary {}. Accessing install_config["log_parser"] and install_config["test_cmd"] directly will then raise a KeyError. Using .get() is safer.

Suggested change

"log_parser": install_config["log_parser"],

"test_cmd": install_config["test_cmd"],

"log_parser": install_config.get("log_parser"),

"test_cmd": install_config.get("test_cmd"),

gemini-code-assist · 2026-06-07T17:18:24Z

+        # Upstream clones into /{repo.split('/')[1]} (see combine.Dockerfile.j2).
+        self.repo_dir = f"/{self.repo.split('/')[1]}"


If self.repo is not in the expected owner/repo format (e.g., if it is just a single repository name), self.repo.split('/') will only have one element, causing an IndexError when accessing index 1. Adding a defensive check prevents potential crashes.

Suggested change

# Upstream clones into /{repo.split('/')[1]} (see combine.Dockerfile.j2).

self.repo_dir = f"/{self.repo.split('/')[1]}"

# Upstream clones into /{repo.split('/')[1]} (see combine.Dockerfile.j2).

repo_parts = self.repo.split("/")

self.repo_dir = f"/{repo_parts[1]}" if len(repo_parts) > 1 else f"/{self.repo}"

gemini-code-assist · 2026-06-07T17:18:24Z

+def _get_modified_files(patch: str) -> list[str]:
+    """Target (``b/``) paths touched by a unified diff, order-preserved & deduped.
+
+    A tiny self-contained replacement for ``swebench.harness.utils.get_modified_files``
+    (only used to reset the test files before/after applying the test patch).
+    Files deleted by the patch (``+++ /dev/null``) are skipped.
+    """
+    files: list[str] = []
+    for line in patch.split("\n"):


If patch is None or not a string, calling patch.split("\n") will raise an AttributeError. Adding a defensive type check ensures the function handles unexpected or empty inputs gracefully.

Suggested change

def _get_modified_files(patch: str) -> list[str]:

"""Target (``b/``) paths touched by a unified diff, order-preserved & deduped.

A tiny self-contained replacement for ``swebench.harness.utils.get_modified_files``

(only used to reset the test files before/after applying the test patch).

Files deleted by the patch (``+++ /dev/null``) are skipped.

"""

files: list[str] = []

for line in patch.split("\n"):

def _get_modified_files(patch: str) -> list[str]:

"""Target (b/) paths touched by a unified diff, order-preserved & deduped.

A tiny self-contained replacement for swebench.harness.utils.get_modified_files

(only used to reset the test files before/after applying the test patch).

Files deleted by the patch (+++ /dev/null) are skipped.

"""

if not patch or not isinstance(patch, str):

return []

files: list[str] = []

for line in patch.split("\n"):

gemini-code-assist · 2026-06-07T17:18:24Z

+            observation = re.sub(r"\x1b\[[0-9;]*m|\r", "", observation).strip()
+            if observation == "":


Calling .strip() immediately on the observation modifies the returned command output by removing all leading and trailing whitespace/newlines. This can affect formatting (e.g., indentation in file content or traceback outputs) when presented to the agent. It is safer to preserve the original whitespace and only strip it for the empty check.

Suggested change

observation = re.sub(r"\x1b\[[0-9;]*m|\r", "", observation).strip()

if observation == "":

observation = re.sub(r"\x1b\[[0-9;]*m|\r", "", observation)

if observation.strip() == "":

yyDing1 added 6 commits June 7, 2026 17:26

update

6f7c939

update

39ca443

update

a124886

update

9c24d7b

update

9e19efc

update

66c26ca

yyDing1 merged commit 5b7a475 into main Jun 7, 2026
3 checks passed

yyDing1 deleted the swe-rebench-v2 branch June 7, 2026 17:17

gemini-code-assist Bot reviewed Jun 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[data] feat: add support for swe rebench v2#56

[data] feat: add support for swe rebench v2#56
yyDing1 merged 6 commits into
mainfrom
swe-rebench-v2

yyDing1 commented Jun 7, 2026

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		"log_parser": install_config["log_parser"],
		"test_cmd": install_config["test_cmd"],

		# Upstream clones into /{repo.split('/')[1]} (see combine.Dockerfile.j2).
		self.repo_dir = f"/{self.repo.split('/')[1]}"

		observation = re.sub(r"\x1b\[[0-9;]*m\|\r", "", observation).strip()
		if observation == "":

Conversation

yyDing1 commented Jun 7, 2026

What does this PR do?

Checklist Before Starting

Test

API and Usage Example

Design & Code Changes

Checklist Before Submitting

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant